What is claimed is: 

1. A computer assisted/implemented method for developing a classifier for classifying 
communications comprising the steps of: 

(a) presenting communications to a user for labeling as relevant or irrelevant, the 
communications being selected from groups of communications including: 

a training set group of communications, the training set group of communications being 
selected by a traditional active learning algorithm; 

a system-labeled set of communications previously labeled by the system; 

a test set group of communications, the test set group of communications being for testing
the accuracy of a current state of the classifier being developed by the present method; 

a faulty set of communications suspected to be previously mis-labeled by the user; and 

a random set of communications previously labeled by the user; 

(b) developing a classifier for classifying communications based upon the 
relevant/irrelevant labels assigned by the user during the presenting step. 

2. The method of claim 1, wherein the presenting step includes the steps of: 

assessing a value that labeling a set of communications from each group will provide to 
the classifier being developed; and 

selecting a next group for labeling based upon the greatest respective value that will be 
provided to the classifier being developed from the assessing step. 
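
As an illustration only, the assessing and selecting steps of claim 2 might be sketched as follows. The function name, the group names and the fixed-weight value function are assumptions made for this sketch; the claim does not specify how the value of labeling a group is computed.

```python
from typing import Callable, Dict, List


def select_next_group(
    groups: Dict[str, List[str]],
    assess_value: Callable[[str, List[str]], float],
) -> str:
    """Return the name of the non-empty group with the greatest assessed value."""
    # Only groups that still hold communications can be presented to the user.
    candidates = {name: items for name, items in groups.items() if items}
    if not candidates:
        raise ValueError("no communications left to label")
    return max(candidates, key=lambda name: assess_value(name, candidates[name]))


# Hypothetical groups mirroring those recited in claim 1.
groups = {
    "training": ["msg-1", "msg-2"],
    "system_labeled": ["msg-3"],
    "test": ["msg-4"],
    "faulty": ["msg-5"],
    "random": [],
}

# Toy value function (an assumption): a fixed weight per group.
weights = {
    "training": 0.4,
    "system_labeled": 0.2,
    "test": 0.3,
    "faulty": 0.6,
    "random": 0.9,
}
next_group = select_next_group(groups, lambda name, items: weights[name])
```

In this toy run the "random" group carries the largest weight but is empty, so the "faulty" group is selected next.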

3. A computer assisted/implemented method for developing a classifier for classifying 
communications (text, electronic, etc.) comprising the steps of: 

(a) presenting communications to a user for labeling as relevant or irrelevant, the 
communications being selected from groups of communications including: 

a training set group of communications, the training set group of communications being 
selected by a traditional active learning algorithm; 

a test set group of communications, the test set group of communications being for testing
the accuracy of a current state of the classifier being developed by the present method; 
and 

a previously-labeled set of communications previously labeled by at least one of the user, 
the system and another user; 

(b) developing a classifier for classifying communications based upon the 
relevant/irrelevant labels assigned by the user during the presenting step. 

4. The method of claim 3, wherein the previously-labeled set of communications includes 
communications previously labeled by the user. 

5. The method of claim 4, wherein the previously-labeled set of communications includes 
communications suspected by the system to be possibly mis-labeled by the user. 
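
The claims do not state how the system comes to suspect a mis-label. One possible heuristic, given purely as an assumption, is to flag communications where the current classifier strongly disagrees with the label the user assigned. The function name, the tuple layout and the 0.9 threshold below are all hypothetical.

```python
from typing import List, Tuple


def suspect_mislabeled(
    labeled: List[Tuple[str, bool, float]],
    threshold: float = 0.9,
) -> List[str]:
    """Return ids of communications whose user label the classifier strongly contradicts.

    Each entry is (communication id, user label, classifier probability of "relevant").
    """
    suspects = []
    for comm_id, user_label, p_relevant in labeled:
        # Probability the classifier assigns to the label opposite the user's.
        p_opposite = 1.0 - p_relevant if user_label else p_relevant
        if p_opposite >= threshold:
            suspects.append(comm_id)
    return suspects


labeled = [
    ("a", True, 0.95),   # user: relevant, classifier agrees
    ("b", True, 0.05),   # user: relevant, classifier strongly disagrees
    ("c", False, 0.97),  # user: irrelevant, classifier strongly disagrees
    ("d", False, 0.40),  # user: irrelevant, classifier mildly agrees
]
suspects = suspect_mislabeled(labeled)
```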

6. The method of claim 3, wherein the previously-labeled set of communications includes 
communications previously labeled by the system. 

7. The method of claim 3, wherein the previously-labeled set of communications includes 
communications previously labeled by a user and communications previously labeled by the 
system. 

8. The method of claim 3 wherein the presenting step includes the steps of: 

assessing a value that labeling a set of communications from each group will provide to 
the classifier being developed; and 

selecting a next group for labeling based upon the greatest respective value that will be 
provided to the classifier being developed from the assessing step. 

9. The method of claim 3 wherein the presenting step includes the steps of: 

assessing a value that labeling a set of communications from each group will provide to 
the classifier being developed; and 

selecting a next group for labeling based upon achieving known performance bounds 
for the classifier. 

10. The method of claim 3 further comprising the step of developing an expression of 
labeling criteria in an interactive session with the user. 

11. The method of claim 10, wherein the interactive session includes the steps of posing 
hypothetical questions to the user regarding what type of information the user would consider 
relevant. 

12. The method of claim 11, wherein the hypothetical questions elicit "yes", "no" and 
"unsure" responses from the user. 

13. The method of claim 11 wherein subsequent questions are based, at least in part, upon 
the answers given to previous questions. 
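
A minimal sketch of such an interactive session, assuming a simple rule that follow-up questions are skipped when the user answers "no", might look like the following. The question wording, the `run_session` name and the skip rule are assumptions for illustration and are not taken from the claims.

```python
from typing import Callable, Dict, List

ANSWERS = {"yes", "no", "unsure"}


def run_session(
    follow_ups: Dict[str, List[str]],
    opening: List[str],
    ask: Callable[[str], str],
) -> Dict[str, str]:
    """Pose questions in order; later questions depend on earlier answers."""
    answers: Dict[str, str] = {}
    pending = list(opening)
    while pending:
        question = pending.pop(0)
        answer = ask(question)
        if answer not in ANSWERS:
            raise ValueError(f"unexpected answer: {answer!r}")
        answers[question] = answer
        # Assumed rule: a "no" prunes the follow-ups for that question.
        if answer != "no":
            pending.extend(follow_ups.get(question, []))
    return answers


follow_ups = {
    "Is the client's product relevant?": ["Is the price of the product relevant?"],
    "Are the client's competitors relevant?": ["Is competitor advertising relevant?"],
}
scripted = {
    "Is the client's product relevant?": "yes",
    "Are the client's competitors relevant?": "no",
    "Is the price of the product relevant?": "unsure",
}
answers = run_session(follow_ups, list(follow_ups), scripted.__getitem__)
```

Here the scripted user answers stand in for a live session; the competitor advertising question is never posed because the user answered "no" to the competitor question.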

14. The method of claim 11 wherein the step of developing an expression of labeling 
criteria produces a criteria document. 

15. The method of claim 14 wherein the expression and/or the criteria document include 
a group of keywords and/or phrases for use by the system in automatically labeling 
communications. 

16. The method of claim 10 wherein the step of developing an expression of labeling 
criteria produces a criteria document. 

17. The method of claim 16 wherein the criteria document includes a list of items that are 
considered relevant and a list of items that are considered irrelevant. 

18. The method of claim 17, wherein the presenting step (a) includes the step of querying 
the user which item(s) influenced the label on a user-labeled communication. 

19. The method of claim 10 wherein the expression and/or the criteria document include 
a group of keywords and/or phrases for use by the system in automatically labeling 
communications. 

20. The method of claim 10 wherein the interactive session is conducted prior to the 
presenting step (a). 

21. A computer assisted/implemented method for developing a classifier for classifying 
communications (text, electronic, etc.) comprising the steps of: 

(a) developing an expression of labeling criteria in an interactive session with the user; 

(b) presenting communications to a user for labeling as relevant or irrelevant; and 

(c) developing a classifier for classifying communications based upon the 
relevant/irrelevant labels assigned by the user during the presenting step; 

wherein at least one of the presenting step (b) and the developing step (c) use the 
expression of labeling criteria developed in the developing step (a). 

22. The method of claim 21, wherein the interactive session includes the steps of posing 
questions to the user regarding what type of information the user would consider relevant. 

23. The method of claim 22, wherein the questions elicit "yes", "no" and "unsure" 
responses from the user. 

24. The method of claim 21 wherein subsequent questions are based, at least in part, upon 
the answers given to previous questions. 

25. The method of claim 21 wherein the questions are structured from several 
dimensional levels of relevance, including a first dimension of question segments on a topic, a 
second dimension of question segments on an aspect of the topic and a third dimension of 
question segments on a type of discussion. 

26. The method of claim 25, wherein: 

the first dimension of question segments on a topic includes one or more of the following 
segments: a segment concerning a client's product and a segment concerning a client's 
competitors; 

the second dimension of question segments on an aspect of the topic includes one or more of the 
following segments: a segment concerning a feature of the first segment, a segment concerning 
the first segment itself, a segment concerning corporate activity of the first segment, a segment 
concerning price of the first segment, a segment concerning news of the first segment and a 
segment concerning advertising of the first segment; 

the third dimension of question segments on a type of discussion includes one or more of the following 
segments: a segment concerning a mention of the second dimension segment, a segment 
concerning a description of the second dimension segment, a segment concerning a usage 
statement about the second dimension segment, a segment concerning a brand comparison 
involving the second dimension segment, and a segment concerning an opinion about the second 
dimension segment. 
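
The three dimensions recited in claims 25 and 26 can be pictured as a cross product of segments. The sketch below enumerates every combination; the short segment names and the question template are assumptions chosen for readability, not wording from the claims.

```python
from itertools import product

# First dimension: topic.
TOPICS = ["client's product", "client's competitors"]
# Second dimension: aspect of the topic.
ASPECTS = ["feature", "itself", "corporate activity", "price", "news", "advertising"]
# Third dimension: type of discussion.
DISCUSSION_TYPES = ["mention", "description", "usage statement", "brand comparison", "opinion"]


def build_questions():
    """Return one question record per (topic, aspect, discussion type) combination."""
    return [
        {
            "topic": topic,
            "aspect": aspect,
            "discussion_type": discussion,
            # Hypothetical template; a real session would phrase this more naturally.
            "text": f"Is this relevant: {discussion} / {aspect} / {topic}?",
        }
        for topic, aspect, discussion in product(TOPICS, ASPECTS, DISCUSSION_TYPES)
    ]


questions = build_questions()
```

With two topics, six aspects and five discussion types, the enumeration yields 60 candidate questions; a session would normally pose only a subset of them, depending on earlier answers.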

27. The method of claim 21 wherein the step of developing an expression of labeling 
criteria produces a criteria document. 

28. The method of claim 27 wherein the criteria document includes a list of items that are 
considered relevant and a list of items that are considered irrelevant. 

29. The method of claim 28 wherein the criteria document includes a group of keywords 
for use by the system in automatically labeling communications. 

30. The method of claim 28, wherein the presenting step (b) includes the step of querying 
the user which items influenced the label on a user-labeled communication. 

31. The method of claim 21 wherein the expression of labeling criteria includes a group 
of keywords and/or phrases for use by the system in automatically labeling communications. 

32. The method of claim 31 wherein the group of keywords is also for use by the system 
in a step of gathering communications. 

33. A computer assisted/implemented method for developing a classifier for classifying 
communications (text, electronic, etc.) comprising the steps of: 

(a) defining a domain of communications on which the classifier is going to operate; 

(b) collecting a set of communications from the domain; 

(c) eliciting labeling communication criteria from a user; 

(d) labeling, by the system, communications from the set of communications according, 
at least in part, to the labeling communication criteria elicited from the user; 

(e) labeling, by the user, communications from the set of communications; 

(f) building a communications classifier according to a combination of labels applied to 
communications in labeling steps (d) and (e). 
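
Labeling steps (d) and (e) and the combination of their labels might be sketched as below. The keyword-matching rule, the function names and the choice to let a user label override a system label are assumptions for illustration; the claim does not prescribe them, and the classifier training that would consume the combined labels is omitted.

```python
from typing import Dict, Optional, Set


def system_label(text: str, relevant: Set[str], irrelevant: Set[str]) -> Optional[bool]:
    """Label by keyword match; return None when the keywords give no clear signal."""
    words = set(text.lower().split())
    hit_relevant = bool(words & relevant)
    hit_irrelevant = bool(words & irrelevant)
    if hit_relevant == hit_irrelevant:
        return None  # no hits, or conflicting hits
    return hit_relevant


def combine_labels(system: Dict[str, bool], user: Dict[str, bool]) -> Dict[str, bool]:
    """Merge both label sources; user labels take precedence (an assumption)."""
    combined = dict(system)
    combined.update(user)
    return combined


communications = {
    "c1": "great battery life on this phone",
    "c2": "lottery winner announced today",
    "c3": "phone lottery giveaway",
}
relevant_keywords = {"phone", "battery"}
irrelevant_keywords = {"lottery"}

system_labels = {}
for comm_id, text in communications.items():
    label = system_label(text, relevant_keywords, irrelevant_keywords)
    if label is not None:
        system_labels[comm_id] = label

user_labels = {"c3": False, "c1": True}
training_labels = combine_labels(system_labels, user_labels)
```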

34. The computer implemented method of claim 33, wherein the combination of the 
labeling steps (d) and (e), and the building step (f) includes the step of selecting communications 
for labeling by the user targeted to build the communications classifier within known 
performance bounds. 

35. The computer implemented method of claim 34, wherein the selecting step selects 
communications from groups of communications including: 

a training set group of communications, the training set group of communications being 
selected by a traditional active learning algorithm; 

a test set group of communications, the test set group of communications being for testing
the accuracy of a current state of the classifier being developed by the present method; and 

a previously-labeled set of communications previously labeled by at least one of the user, 
the system and another user. 

36. The computer implemented method of claim 34, wherein the selecting step selects 
communications from groups of communications including: 

a training set group of communications, the training set group of communications being 
selected by a traditional active learning algorithm; 

a system-labeled set of communications previously labeled by the system; 

a test set group of communications, the test set group of communications being for testing
the accuracy of a current state of the classifier being developed by the present method; 

a faulty set of communications suspected to be previously mis-labeled by the user; and 

a random set of communications previously labeled by the user. 

37. The computer implemented method of claim 33, wherein the communication criteria 
elicited in the eliciting step (c) is used, in part, to determine communications to collect in the 
collecting step (b). 

38. The computer implemented method of claim 37, wherein the eliciting step (c) 
involves an interactive session with the user. 

39. The computer implemented method of claim 37, wherein the communication criteria 
elicited in the eliciting step (c) is used, in part, by the system to label communications in the 
labeling step (d). 

40. The computer implemented method of claim 39, wherein the eliciting step (c) 
involves an interactive session with the user. 

41. The method of claim 33, wherein the building step (f) involves an active learning 
process. 

42. The computer implemented method of claim 33, wherein the communication criteria 
elicited in the eliciting step (c) is used, in part, by the system to label communications in the 
labeling step (d). 

43. The computer implemented method of claim 33, wherein the eliciting step (c) 
involves an interactive session with the user. 

44. The method of claim 43, wherein the interactive session includes the steps of 
posing questions to the user regarding what type of information the user would consider relevant. 

45. The method of claim 44, wherein the interactive session also allows the user to 
provide keywords based upon criteria the user considers relevant. 

47. The method of claim 44, wherein the questions elicit "yes", "no" and "unsure" 
responses from the user. 

48. The method of claim 43, wherein the building step (f) involves an active learning 
process. 

49. A computer memory containing a software program including instructions for 
implementing a method for developing a classifier for classifying communications (text, 
electronic, etc.) comprising the steps of: 

(a) presenting communications to a user for labeling as relevant or irrelevant, the 
communications being selected from groups of communications including: 

a training set group of communications, the training set group of communications being 
selected by a traditional active learning algorithm; 

a test set group of communications, the test set group of communications being for testing
the accuracy of a current state of the classifier being developed by the present method; and 

a previously-labeled set of communications previously labeled by at least one of the user, 
the system and another user; 

(b) developing a classifier for classifying communications based upon the 
relevant/irrelevant labels assigned by the user during the presenting step. 

50. A computer memory containing a software program including instructions for 
implementing a method for developing a classifier for classifying communications (text, 
electronic, etc.) comprising the steps of: 

(a) developing an expression of labeling criteria in an interactive session with the user; 

(b) presenting communications to a user for labeling as relevant or irrelevant; and 

(c) developing a classifier for classifying communications based upon the 
relevant/irrelevant labels assigned by the user during the presenting step; 

wherein at least one of the presenting step (b) and the developing step (c) use the 
expression of labeling criteria developed in the developing step (a). 

51. A computer system memory containing a software program including instructions for 
implementing a method for developing a classifier for classifying communications (text, 
electronic, etc.) comprising the steps of: 

(a) defining a domain of communications on which the classifier is going to operate; 

(b) collecting a set of communications from the domain; 

(c) eliciting labeling communication criteria from a user; 

(d) labeling, by the computer system, communications from the set of communications 
according, at least in part, to the labeling communication criteria elicited from the user; 

(e) labeling, by the user, communications from the set of communications; 

(f) building a communications classifier according to a combination of labels 
applied to communications in labeling steps (d) and (e). 